- /*
- * m c . c
- *
- * Multi-column filter
- *
- */
-
- #ifdef DOCUMENTATION
-
- title mc Convert One or More Files To Multi-column Format
- index Convert a file to multi-column format
- index Combine files in multi-column format
-
- Synopsis
-
- mc [-t] [-c columns] [-h height] [-g gutter] [-w width]
- [filespec | aliasspec]...
-
- Description
-
- mc reads one or more files and converts them to a single multi-column
- file that it writes to the standard output.
-
- Each line of each input file will occupy one row-column position in
- the output file. If an item is too wide for its column, it is
- truncated with no message. The items from any one file are placed in
- order going down a column, and then across columns. Thus, consider a
- simple case first:
-
- .tp 11
- mc a -c 2 -h 10
-
- produces a file whose first page looks like this:
-
- a1 a11
- a2 a12
- : :
- : :
- a10 a20
-
- ("-c 2" requests two columns; "-h 10" requests ten lines per page).
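
The down-then-across ordering can be pictured as a mapping from an item's position on the page to a row and a column. The following is a minimal sketch for illustration only; `struct cell` and `place_item` are hypothetical names that do not appear in mc.c.

```c
#include <assert.h>

/* Hypothetical illustration type: a row/column position on the page. */
struct cell { int row; int col; };

/*
 * Where does the i'th item of a page (counting from 0) land when
 * columns are filled top to bottom, `height` items per column?
 */
struct cell place_item(int i, int height)
{
    struct cell c;

    assert(i >= 0 && height > 0);
    c.col = i / height;     /* every `height` items start a new column */
    c.row = i % height;     /* position going down within that column  */
    return c;
}
```

With "-h 10", `place_item(10, 10)` yields row 0, column 1, which is where a11 sits in the example above.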
-
- File Specifications
-
- The mc command line may contain more than one file specification;
- however, the specifications may not be wild-carded.
-
- When there are multiple files, each column will come from a single
- file:
-
- .tp 9
- mc a b -c 2 -h 10
-
- produces:
-
- a1 b1
- a2 b2
- : :
- : :
- a10 b10
-
- Columns are filled by consecutive files, in rotating order:
-
- .tp 9
- mc a b -c 4 -h 10
-
- produces:
-
- a1 b1 a11 b11
- a2 b2 a12 b12
- : : : :
- : : : :
- a10 b10 a20 b20
-
- If there is more than one file, and the number of columns is not a
- multiple of the number of files, successive pages will be different.
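
The rotation can be sketched as simple modular arithmetic, assuming (as process() does) that the current file advances by one at the bottom of every column and carries on across page boundaries. `file_for_column` is a hypothetical helper written for this illustration, not a function in mc.c.

```c
#include <assert.h>

/*
 * Index (0-based) of the file specification that supplies column `col`
 * on page `page`, given `columns` columns per page and `nf` file and
 * alias specifications.  Pages and columns are counted from 0.
 */
int file_for_column(int page, int col, int columns, int nf)
{
    assert(page >= 0 && col >= 0 && columns > 0 && nf > 0);
    return (page * columns + col) % nf;
}
```

For two files and four columns every page starts with file 0. For three files and four columns the second page starts with file 1, which is why successive pages can differ.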
-
- If a file reaches EOF while there is still data to be read from other
- files, the ended file's columns will be blank from that point on.
-
- The use of "-" as a file specification causes the standard input to
- be read. If mc is invoked with no file arguments at all, it reads
- the standard input file once.
-
- Alias Specifications
-
- Alias specifications provide a method for controlling the placement
- of file data. An alias specification is a reference to another file
- (or alias) specification. File and alias specifications are numbered,
- starting at one for the left-most such specification; switches and
- their arguments do not affect the numbering. The alias specification
- #n indicates that the n'th specification is to be repeated. Such a
- specification is legal only if it refers to an earlier specification;
- i.e. #n is only legal as the n+1'st, n+2'nd, etc. specification.
- Thus:
-
- .tp 9
- mc a #1 b #3 -c 4 -h 10
-
- produces:
-
- a1 a11 b1 b11
- a2 a12 b2 b12
- : : : :
- : : : :
- a10 a20 b10 b20
-
- This should be compared with:
-
- mc a a b b -c 4 -h 10
-
- which opens each of a and b twice and reads the copies in parallel,
- placing two copies of each item on the page.
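
The numbering rule for aliases can be sketched as follows. This is an illustration under the rules stated above, not the code mc uses (mc copies the FILE pointer of the earlier specification instead); `resolve_spec` is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Resolve specification number k (0-based index into specs[]) to the
 * index of the file specification it ultimately names.  Returns -1 if
 * an alias does not refer to an earlier specification.
 */
int resolve_spec(char *specs[], int k)
{
    assert(k >= 0);
    while (specs[k][0] == '#') {
        int n = atoi(&specs[k][1]) - 1;   /* "#n" counts from one */

        if (n < 0 || n >= k)              /* must point backwards */
            return -1;
        k = n;                            /* may itself be an alias */
    }
    return k;
}
```

For the specifications a, #1, b, #3 the second entry resolves to index 0 (a) and the fourth to index 2 (b).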
-
- Last-page Handling
-
- mc will attempt to make the columns on the last page of output as
- close in length as possible, rather than simply filling some columns
- all the way to the bottom and leaving others empty. This special
- handling is enabled only when mc is given no more than one file
- specification.
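
The balancing amounts to using only as many rows as the remaining items need. The sketch below mirrors the rounding-up division that output() performs in the single-file case; the function name `last_page_rows` is hypothetical.

```c
#include <assert.h>

/*
 * Number of rows needed to spread `items` items over `columns`
 * columns as evenly as possible (items / columns, rounded up).
 */
int last_page_rows(int items, int columns)
{
    assert(items >= 0 && columns > 0);
    return (items + (columns - 1)) / columns;
}
```

Seven leftover items in two columns need four rows, giving columns of four and three items instead of one column of seven.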
-
- Switches
-
- The following switches are available:
- .lm +8;
-
- -c Next argument is the number of columns
- -h Next argument is the height, in lines, of a page
- -g Next argument is the gutter width (the space between columns)
- -w Next argument is the width, in characters, of a page
- -t Terminal mode; sets default height to 23, default width to 80, and, if
- stdout is your terminal, pauses after each page
- -d Debug (conditionally-compiled code)
-
- .lm -8;
-
- Defaults
-
- Height 58, width 132, no terminal mode; note that -t alters all three.
- Gutter 1, max(number-of-file-and-alias-specs,2) columns.
-
- Control Character Handling
-
- mc is designed to operate on text files, not binary files. It sets
- columns up based on how they will look when displayed. Hence, it
- processes control characters (any character for which isprint()
- returns FALSE) as follows:
-
- <TAB> ("\t") is expanded into the equivalent number of spaces.
- mc always assumes that there are tab stops every 8 character
- positions.
-
- <BS> ("\b") subtracts one from the current cursor position.
- A printable character received while the current cursor
- position is over a previously read character is ignored;
- i.e. overstruck combinations retain only the first character
- read. However, if the character being overstruck is a space
- or an underscore ("_"), the overstriking character replaces
- it.
-
- <BS> when the current cursor position is over the first
- character in the line is ignored.
-
- <CR> ("\r") resets the current cursor position to the first
- character of the line.
-
- <LF> ("\n") ends the current input item.
-
- All other control characters are discarded.
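
The rules above can be sketched as a function that works on an in-memory string instead of a FILE. This is an illustration modelled on get(), not a replacement for it; `render_line` and `put_char` are hypothetical names, and the input is assumed to be one line without its trailing newline.

```c
#include <assert.h>
#include <ctype.h>

/* Store c at the cursor if the rules allow it, then advance the cursor. */
static void put_char(char *out, int max, int *pos, int *high, int c)
{
    if (*pos < max &&
        (*pos == *high || out[*pos] == ' ' || out[*pos] == '_')) {
        out[*pos] = (char)c;
        if (*pos == *high)
            (*high)++;
    }
    (*pos)++;
}

/*
 * Apply the control character rules to `in`, keeping at most `max`
 * characters.  `out` must have room for max + 1 bytes; it is
 * NUL-terminated.  Returns the length of the result.
 */
int render_line(const char *in, char *out, int max)
{
    int pos = 0;    /* current cursor position          */
    int high = 0;   /* number of positions filled so far */
    int c;

    assert(max >= 0);
    while ((c = (unsigned char)*in++) != '\0') {
        if (c == '\b') {
            if (pos > 0)            /* ignored at start of line */
                pos--;
        } else if (c == '\r') {
            pos = 0;
        } else if (c == '\t') {
            do {                    /* spaces up to the next tab stop */
                put_char(out, max, &pos, &high, ' ');
            } while (pos % 8 != 0);
        } else if (isprint(c)) {
            put_char(out, max, &pos, &high, c);
        }
        /* any other control character is discarded */
    }
    out[high] = '\0';
    return high;
}
```

For example, the input "_", backspace, "x" renders as "x" because an underscore may be overstruck, while "x", backspace, "_" also renders as "x" because the first character read is kept.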
-
- File Limits
-
- Different systems impose different limits on the number of files
- mc may simultaneously have open.
-
- On RT-based systems, this limit is totally dynamic; opening too many
- files is most likely to cause an error due to insufficient memory,
- rather than a file system error per se.
-
- On RSX-based systems, the absolute limit is set at task-build time.
- The distributed source, when built with BUILD, will allow for 10
- open files. Note that this total includes at least one file for
- stdin and stdout (two if either is redirected). Of course, it is
- possible to run into memory limitations even under this limit.
-
- Aliases do not count against this limit, since they refer to
- already-opened files. Similarly, a "-" argument, implying stdin,
- does not count, as it is simply a reference to the already-open
- standard input. If you are right at the limit, be sure to use
- the standard input as one of your files; you are paying for it
- to be open whether you use it or not.
-
- mc itself imposes another limit, which does include aliases. In the
- distributed code, this limit is set to a total of 20 file and alias
- arguments.
-
- Other Limits
-
- No column can contain more than 256 characters (compile-time
- constant).
-
- Diagnostics
-
- Insufficient memory - sorry
-
- Too many file and alias arguments
-
- .tp 2
- Unreasonable -c/-g/-w combination
- -- for example, c > w
-
- .tp 2
- <value>: Bad <what> specification
- -- Invalid value for something like -h
-
- <filespec>: Can't open: <why>
-
- Suggested Improvements
-
- Anyone interested in improving this program might want to consider
- the following suggestions. Be warned that they are not as easy to
- implement as they look!
-
- Make the last-page cleanup algorithm work for the multiple-files case.
-
- Add the ability to fold long items to the next entry for this file
- (probably indented) rather than just chopping them off. (This only
- gets hard when you consider both overstrike handling and multiple
- files!)
-
- Allow the automatic printing of the file name above the appropriate
- columns at the top of each page.
-
- The techniques used are wasteful of space; in particular, the
- gutters should not be taking up space in the data array!
-
- Bugs
-
- Author
-
- Original author unknown; extensively modified by Jerry Leichter
-
- #endif
-
- char *documentation[] = {
- " mc [-t] [-c columns] [-h height] [-g gutter] [-w width]",
- " [filespec | aliasspec]...",
- "",
- "mc reads one or more files and converts them to a single multi-column file",
- "that it writes to the standard output. ",
- "",
- "The mc command line may contain more than one file specification; however,",
- "the specifications may not be wild-carded.",
- "",
- "Each line of each input file will occupy one row-column position in the output",
- "file. If an item is too wide for its column, it is truncated with no message.",
- "",
- "The items from any one file are placed in order going down a column, and then",
- "across columns. Columns are filled by consecutive files, in rotating order.",
- "",
- "The use of \"-\" as a file specification causes the standard input to be read.",
- "If mc is invoked with no file arguments at all, it reads the standard input",
- "file once.",
- "",
- "Alias specifications provide a method for controlling the placement of file",
- "data. An alias specification is a reference to another file (or alias)",
- "specification. File and alias specifications are numbered, starting at one",
- "for the left-most such specification; switches and their arguments do not",
- "affect the numbering. The alias specification #n indicates that the n'th",
- "specification is to be repeated. Such a specification is legal only if it",
- "refers to an earlier specification; i.e. #n is only legal as the n+1'st,",
- "n+2'nd, etc. specification.",
- "",
- "mc will attempt to make the columns on the last page of output as close in",
- "length as possible, rather than simply filling some columns all the way to",
- "the bottom and leaving others empty. This special handling is enabled only",
- "when mc is given no more than one file specification.",
- "",
- "The following switches are available:",
- " ",
- " -c Next argument is the number of columns",
- " -h Next argument is the height, in lines, of a page",
- " -g Next argument is the gutter width (the space between columns)",
- " -w Next argument is the width, in characters, of a page",
- " -t Terminal mode; sets default height to 23, default width to 80,",
- " and, if stdout is your terminal, pauses after each page",
- "",
- "The default values are:",
- "",
- " Height 58, width 132, no terminal mode; note that -t alters all three.",
- " Gutter 1, max(number-of-file-and-alias-specs,2) columns.",
- 0 };
-
- #include <ctype.h>
- #include <stdio.h>
- #include <malloc.h>
- #include <string.h>
- #include <io.h>
-
- #define exits(x) exit(x)
- #define IO_ERROR 2
- #define FALSE 0
- #define TRUE 1
- #define EOS 0
-
- /*
- * Turn on to include debugging code
- */
-
- #define LINEMAX 256 /* Maximum line length handled */
- /* (also maximum column width) */
- #define NFILES 20 /* Maximum files (including */
- /* aliased files) */
- #define ALIAS '#' /* Marks an alias argument */
- /* (Can't be "-") */
- int debug = 0;
- int columns = -1; /* All these will be given */
- int gutter = -1; /* default values later unless */
- int height = -1; /* the user sets them first */
- int width = -1; /* (to a positive value!) */
-
- int pause = FALSE; /* Pause-at-end of page flag */
- int first = TRUE; /* First-time-through flag */
- int cwidth; /* Total (column+gutter) width */
- int pagesize; /* Total bytes in page */
- int nf; /* Number of file & alias specs */
- int files = 0; /* Number of files still open */
- int aliases = 0; /* Number of alias specs */
-
- FILE *file[NFILES]; /* File pointers for our files */
- char line[LINEMAX]; /* Input line buffer */
- int linelen; /* Length of a line in line[] */
- int lineend; /* Last usable line[] position */
- char *page; /* -> page paste-up matrix */
- char *copy();
-
- main(argc,argv)
- int argc;
- char *argv[];
- {
- register char *p;
- register int c,i;
- int n;
- FILE *fp;
-
- if (argc == 2 && argv[1][0] == '?' && strlen(argv[1]) == 1)
- {
- help();
- return;
- }
-
- nf = argc - 1;
- for (i = 1; i < argc; i++) {
- p = argv[i];
- if (*p == '-') {
- if (p[1] == '\0') /* stdin as a file */
- continue; /* skip this one */
-
- argv[i] = 0;
- --nf;
- for (++p; c = *p++;)
- switch(tolower(c))
- {
- case 'd':
- debug++;
- break;
- case 'c':
- if (++i >= argc)
- usage();
- columns = atoi(argv[i]);
- if (columns <= 0)
- bad(argv[i],"columns");
- argv[i] = 0;
- --nf;
- break;
-
- case 'g':
- if (++i >= argc)
- usage();
- gutter = atoi(argv[i]);
- if (gutter <= 0)
- bad(argv[i],"gutter");
- argv[i] = 0;
- --nf;
- break;
-
- case 'h':
- if (++i >= argc)
- usage();
- height = atoi(argv[i]);
- if (height <= 0)
- bad(argv[i],"height");
- argv[i] = 0;
- --nf;
- break;
-
- case 't':
- if (height<0)
- height = 23;
- if (width<0)
- width = 80;
- if (isatty(fileno(stdout)))
- pause++;
- break;
-
- case 'w':
- if (++i >= argc)
- usage();
- width = atoi(argv[i]);
- if (width <= 0)
- bad(argv[i],"width");
- argv[i] = 0;
- --nf;
- break;
-
- default:
- usage();
- break;
- }
- }
- }
-
- if (nf > NFILES)
- error("Too many file and alias arguments\n");
-
- if (nf == 0)
- {
- nf = 1; /* Run as a filter */
- file[files++] = stdin;
- }
- else
- for (i = 1; i < argc; i++)
- if (p = argv[i])
- switch (*p)
- {
- case '-': /* stdin as a file */
- file[files++] = stdin;
- break;
-
- case ALIAS:
- n = atoi(&p[1]) - 1;
- if (n < 0 || n >= files)
- error(
- "\"%s\": bad alias specification - no such file\n",p
- );
- file[files++] = file[n];
- aliases++;
- break;
-
- default:
- if ((fp = fopen(p,"r")) == NULL)
- {
- fprintf(stderr,
- "%s: ",p);
- perror("Can't open");
- exits(IO_ERROR);
- }
- file[files++] = fp;
- break;
- }
-
- files -= aliases; /* Aliases aren't open */
-
- /*
- * Establish defaults for any parameters the user didn't set
- */
- if (width < 0)
- width = 132;
- if (gutter < 0)
- gutter = 1;
- if (height < 0)
- height = 58;
- if (columns < 0)
- if (nf > 1)
- columns = nf;
- else
- columns = 2;
-
- /*
- * The last column isn't followed by a gutter, but dealing with this makes
- * the computation too complex; so we simply pretend the page is wider, which
- * is ok since the code trims the trailing spaces that would go there anyway.
- * This is, of course, quite wasteful of space, but then so is the whole algo-
- * rithm; we shouldn't be storing ANY of the gutters explicitly.
- */
- width += gutter;
- cwidth = width/columns;
-
- if (cwidth <= gutter || (cwidth - gutter) > LINEMAX)
- error("Unreasonable -c/-g/-w combination\n");
-
- lineend = (int)line + (cwidth - gutter);
-
- page = malloc((pagesize = height * width) + 1); /* +1 for the NUL the debug dump stores at page[pagesize] */
- if (page == NULL)
- error("Insufficient memory - sorry\n");
-
- if (debug)
- {
- fprintf(stderr, "width %d, height %d, columns %d, cwidth %d\n", width, height, columns, cwidth);
- fprintf(stderr, "\tgutter %d, pause %d, pagesize %d, page at %p\n", gutter, pause, pagesize, (void *)page);
- fprintf(stderr,"%d files(%d real + %d aliases)\n", nf, files, aliases);
- }
-
- process();
- free(page);
- }
-
- /*
- * Process all the data
- */
- process()
- {
- register int offset; /* Offset into page */
- register int items; /* Counts items added */
- register int maxitems; /* Room for this many */
- int curfile; /* Current file */
-
- maxitems = columns * height;
- blank();
- curfile = items = offset = 0;
- while (get(file[curfile]))
- {
- if (items >= maxitems)
- {
- output(items);
- blank();
- items = offset = 0;
- }
- if (debug)
- fprintf(stderr,"Inserting %s at offset %d, file %d\n", line, offset, curfile);
- copy(page+offset,line,linelen);
- items++;
- if ((items % height) == 0) /* Bottom of a column */
- {
- curfile = (curfile + 1) % nf;
- if (debug)
- fprintf(stderr,"Switching to file %d of %d\n", curfile,nf);
- }
- offset += cwidth;
- }
- output(items);
- }
-
- /*
- * Print out the buffered page, which has been filled with items items.
- */
- output(items)
- int items; /* # of items the caller used */
- {
- int nrows;
- register int i,col,row;
-
- if (debug)
- fprintf(stderr,"output(%d)\n",items);
-
- if (items <= 0) /* Nothin' to do */
- return;
-
- /*
- * Get number of rows we'll need. This is the basis of the "last page"
- * optimization - we don't use all the rows, just enough to hold everything
- * (items/columns, rounded up). If there is more than one file, just use
- * the whole page.
- */
- if (nf == 1)
- nrows = (items + (columns - 1)) / columns;
- else
- nrows = height;
-
- if (debug)
- {
- fprintf(stderr,"items %d, nrows %d\n",items,nrows);
- page[pagesize] = 0;
- fprintf(stderr,"Dump of page:\n%s\n",page);
- }
-
- if (first)
- first = FALSE;
- else
- {
- if (pause)
- {
- printf("\t\t\t Type CTRL/Z to exit, any other key to continue...");
- fflush(stdout);
- i = getch(); /* Wait for a key; kbhit() only polls and never returns the key code */
- putchar('\n');
- if (i == 26) /* CTRL/Z */
- exit(0);
- }
- putchar('\f');
- }
-
- /*
- * Scan through page[] row-wise, after having filled it column-wise. (Page[]
- * is laid out column-wise in memory.)
- */
- for (row = 0; row < nrows; row++)
- {
- for (col = 0; col < columns; col++)
- putitem(page+(row+col*nrows)*cwidth,
- (col == columns - 1));
- putchar('\n');
- }
- }
-
- /*
- * Put out one item, possibly trimming trailing spaces
- */
- putitem(base,trim)
- register char *base; /* First char to put */
- int trim; /* Trim trailing spaces */
- {
- register char *end; /* End of item */
-
- end = &base[cwidth - 1];
- if (trim)
- while (end >= base && *end == ' ') /* Stop at the start of an all-blank item */
- --end;
-
- while (base <= end)
- putchar(*base++);
- }
-
- /*
- * Blank out page[]
- */
- blank()
- {
- register int n;
-
- for (n = 0; n < pagesize;)
- page[n++] = ' ';
- }
-
- /*
- * Fill line[]; return FALSE when all files have reached EOF, TRUE until then.
- */
- get(fp)
- FILE *fp;
- {
- register char *p; /* Current char pos */
- register char *high; /* Char pos high water */
- register int c; /* Character */
-
- if (feof(fp))
- {
- linelen = 0; /* Pretend we read "" */
- return(TRUE);
- }
-
- high = p = line;
- while ((c = getc(fp)) != EOF && c != '\n')
- switch(c)
- {
- case '\b':
- if (p > line)
- --p;
- break;
-
- case '\r':
- p = line;
- break;
-
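- case '\t':
- /*
- * Turn the tab into one space. Unless that space fills the last
- * position before a tab stop, push the tab back so it is read again.
- */
- if (((p - line) & 07) != 07)
- ungetc(c,fp);
- c = ' ';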
- /*
- * Fall through...
- */
- default:
- if (isprint(c))
- {
- if (((p - line) < (cwidth - gutter)) && (p == high || *p == '_' || *p == ' '))
- {
- *p = c;
- if (p == high)
- high++;
- }
- p++;
- }
- break;
- }
-
- linelen = high - line;
-
- if (c != EOF)
- return(TRUE);
- else
- return((--files != 0));
- }
-
- bad(v,s)
- char *v;
- char *s;
- {
- error("\"%s\": bad %s specification\n",v,s);
- }
-
- usage()
- {
- fprintf(stderr,"Usage:\n mc [-t] [-c columns] [-g gutter] ");
- fprintf(stderr,"[-h height] [-w width] [file | #n]...\n");
- error("mc ? for help\n");
- }
-
- help()
- /*
- * Give good help
- */
- {
- register char **dp;
-
- for (dp = documentation; *dp; dp++)
- printf("%s\n",*dp);
- }
-
- /*
- * c o p y . c
- */
-
- #ifdef DOCUMENTATION
-
- title copy Copy a Given Number of Bytes
- index Copy a given number of bytes
-
- synopsis
- .s.nf
- char *
- copy(out, in, count)
- char *out; /* Output vector */
- char *in; /* Input vector */
- unsigned int count; /* Bytes to copy */
- .s.f
- Description
-
- Copy the indicated number of bytes from the input area
- to the output area. Return a pointer to the first free
- byte in the output area. (I.e., &out[count]).
-
- The copying will be faster if out and in are either both
- even or both odd addresses.
-
- #endif
-
- #define SHIFT 1
- #define LOWBIT 01
-
- char *copy(out, in, count)
- register char *out;
- register char *in;
- register unsigned int count;
- /*
- * Copy a given number of bytes
- */
- {
- if (count != 0)
- {
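- #ifdef SHIFT
- if (count > 10 && sizeof(int) == (1 << SHIFT))
- {
- /*
- * Try to optimize: align in, then copy a word at a time
- */
- if ((((unsigned int) in) & LOWBIT) != 0)
- {
- *out++ = *in++;
- count--;
- }
- if ((((unsigned int) out) & LOWBIT) == 0)
- {
- register unsigned int words = count >> SHIFT; /* Get a word count */
- count &= LOWBIT; /* Odd trailing byte, if any */
- do
- {
- *(int *)out = *(int *)in;
- out += sizeof(int);
- in += sizeof(int);
- } while (--words != 0);
- if (count == 0)
- goto exit;
- }
- }
- #endif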
- /*
- * Here for small copies, strange machines, and copies where
- * the output buffer isn't the same parity as the input buffer.
- */
- do
- {
- *out++ = *in++;
- } while (--count != 0);
- }
- exit: return (out);
- }
-
- /*
- * e r r o r . c
- */
-
- #ifdef DOCUMENTATION
-
- title error Fatal Error Exit
- index Fatal error exit
-
- synopsis
- .s.nf
- _error()
-
- error(format)
- char *format;
- .s.f
- documentation
-
- Fatal error exits. _error() halts, error() prints something
- on stderr and then halts.
-
- bugs
-
- #endif
-
- #include <stdarg.h>
-
- char recycle = 0; /* Prevents looping */
-
- error(format)
- char *format;
- /*
- * Error message before retiring.
- */
- {
- va_list args;
-
- va_start(args, format);
- if (recycle++ == 0) {
- vfprintf(stderr, format, args);
- }
- va_end(args);
- _error();
- }
-
- _error()
- {
- abort();
- }
-